Effect of initial configuration on network-based recommendation 
In this paper, based on a weighted object network, we propose a recommendation algorithm,
which is sensitive to the configuration of initial resource distribution. Even under the simplest case
with binary resource, the current algorithm has remarkably higher accuracy than the widely applied
global ranking method and collaborative filtering. Furthermore, we introduce a free parameter $\beta$
to regulate the initial configuration of resource. The numerical results indicate that decreasing the
initial resource located on popular objects can further improve the algorithmic accuracy. More
significantly, we argue that a better algorithm should simultaneously have higher accuracy and be
more personal. According to a newly proposed measure about the degree of personalization, we
demonstrate that a degree-dependent initial configuration can outperform the uniform case for both
accuracy and personalization strength.

PACS numbers: 89.75.Hc, 87.23.Ge, 05.70.Ln 



Introduction. — The exponential growth of the Internet [1]
and the World Wide Web [2] confronts people with an
information overload: they face too many data and sources
to find those most relevant to them. Thus far, the most
promising way to efficiently filter this overload is to
provide personal recommendations, that is, to use the
personal information of a user (i.e., the historical track
of this user's activities) to uncover his/her habits and
take them into account in the recommendation. For instance,
Amazon.com uses one's purchase history to provide individual
suggestions. If you have bought a textbook on statistical
physics, Amazon may recommend other statistical physics
books to you. Based on the well-developed Web
2.0 technology, recommendation systems are frequently 
used in web-based movie-sharing (music-sharing, book- 
sharing, etc.) systems, web-based selling systems, and 
so on. Motivated by the significance to the economy
and society, recommendation algorithms are being
extensively investigated in the engineering community.
Various kinds of algorithms have been proposed, including
correlation-based methods, content-based methods,
spectral analysis, principal component analysis [9],
and so on.

Very recently, some physical dynamics, including the heat
conduction process [10] and mass diffusion, have found
applications in personal recommendation. These physical
approaches have been demonstrated to be both highly
efficient and of low computational complexity [13]. In this
paper, we introduce a network-based recommendation algorithm
with degree-dependent initial configuration. Compared with
uniform initial configuration, the prediction accuracy can
be remarkably enhanced by using the degree-dependent
configuration.
More significantly, besides the prediction accuracy, we 
present novel measurements to judge how personal the 
recommendation results are. The algorithm providing 
more personal recommendations has, in principle, greater 
ability to uncover the individual habits. Since main- 
stream interests are more easily uncovered, a user may 
appreciate a system more if it can recommend the un- 
popular objects he/she enjoys. Therefore, we argue that 
those two kinds of measurements, accuracy and degree of 
personalization, are complementary to each other in eval- 
uating a recommendation algorithm. Numerical simula- 
tions show that the optimal initial configuration subject 
to accuracy can also generate more personal recommen- 
dations. 

Method. — A recommendation system consists of users 
and objects, and each user has collected some objects. 
Denoting the object set as $O = \{o_1, o_2, \cdots, o_n\}$ and
the user set as $U = \{u_1, u_2, \cdots, u_m\}$, the
recommendation system can be fully described by an
$n \times m$ adjacency matrix $A = \{a_{ij}\}$, where
$a_{ij} = 1$ if $o_i$ is collected by $u_j$, and
$a_{ij} = 0$ otherwise. A reasonable assumption is that
the objects you have collected are what you like, and a 
recommendation algorithm aims at predicting your per- 
sonal opinions (to what extent you like or hate them) on 
those objects you have not yet collected. Mathematically 
speaking, for a given user, a recommendation algorithm 
generates a ranking of all the objects he/she has not col- 
lected before. The top L objects are recommended to 
this user, with L the length of the recommendation list. 

Based on the user-object relations A, an object net- 
work can be constructed, where each node represents an 
object, and two objects are connected if and only if they 
have been collected simultaneously by at least one user. 

FIG. 1: (Color online) The ranking score $\langle r \rangle$ vs. $\beta$.
The optimal $\beta$, corresponding to the minimal
$\langle r \rangle \approx 0.098$, is $\beta_{\rm opt} \approx -0.8$.
All the data points shown in the main plot are obtained
by averaging over five independent runs with different
data-set divisions. The inset shows the numerical results of
every separate run, where each curve represents one random
division of the data set.

We assume a certain amount of resource (e.g. recommendation
power) is associated with each object, and the weight
$w_{ij}$ represents the proportion of the resource $o_j$
would like to distribute to $o_i$. For example, in a
book-selling system, the weight $w_{ij}$ contributes to the
strength with which book $o_i$ is recommended to a customer
provided he/she has bought book $o_j$. Following a
network-based resource-allocation process where each object
distributes its initial resource equally to all the users
who have collected it, and then each user sends back what
he/she has received to all the objects he/she has collected
(also equally), the weight $w_{ij}$ (the fraction of initial
resource $o_j$ eventually gives to $o_i$) can be expressed as:

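$$w_{ij} = \frac{1}{k(o_j)} \sum_{l=1}^{m} \frac{a_{il} a_{jl}}{k(u_l)}, \qquad (1)$$

where $k(o_j) = \sum_{i=1}^{m} a_{ji}$ and
$k(u_l) = \sum_{i=1}^{n} a_{il}$ denote the degrees of
object $o_j$ and user $u_l$, respectively. Clearly, the
weight between two unconnected objects is zero. According
to the definition of the weight matrix $W = \{w_{ij}\}$,
if the initial resource vector is $\mathbf{f}$, the final
resource distribution is $\mathbf{f}' = W\mathbf{f}$.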
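
As a minimal sketch of the two-step resource allocation described above, the weight matrix can be computed directly from the adjacency matrix. The function name `weight_matrix`, the toy matrix, and the zero-degree guard are assumptions of this illustration, not part of the original work.

```python
import numpy as np

def weight_matrix(A):
    """Return W with W[i, j] = (1 / k(o_j)) * sum_l A[i, l] * A[j, l] / k(u_l).

    A is the n x m object-user adjacency matrix (rows: objects, columns: users).
    """
    A = np.asarray(A, dtype=float)
    k_obj = A.sum(axis=1)  # k(o_j): number of users who collected object j
    k_usr = A.sum(axis=0)  # k(u_l): number of objects collected by user l
    # Assumption of this sketch: isolated objects or users get weight zero
    # instead of triggering a division by zero.
    inv_obj = np.divide(1.0, k_obj, out=np.zeros_like(k_obj), where=k_obj > 0)
    inv_usr = np.divide(1.0, k_usr, out=np.zeros_like(k_usr), where=k_usr > 0)
    # (A * inv_usr) scales column l by 1/k(u_l); the product with A.T sums over users.
    shared = (A * inv_usr) @ A.T
    # Broadcasting over the last axis scales column j by 1/k(o_j).
    return shared * inv_obj

# Toy system with 4 objects and 3 users (hypothetical data).
A = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [0, 0, 1]])
W = weight_matrix(A)
print(W.sum(axis=0))  # every column sums to 1: the resource of each object is conserved
```

Each column of W sums to one whenever the corresponding object has been collected at least once, which is a convenient sanity check on any implementation.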

The general framework of the proposed network-based
recommendation is as follows: (i) construct the weighted
object network (i.e. determine the matrix $W$) from the
known user-object relations; (ii) determine the initial
resource vector $\mathbf{f}$ for each user; (iii) get the
final resource distribution via $\mathbf{f}' = W\mathbf{f}$;
(iv) recommend those uncollected objects with the highest
final resource. Note that the initial configuration
$\mathbf{f}$ is determined by the user's personal
information, thus for different users, the initial
configuration is different. From now on, for a given user
$u_i$, we use $\mathbf{f}^i$ to emphasize this personal
configuration.
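
The four steps above can be sketched end to end for a single user. This is an illustrative sketch under stated assumptions: the function name `recommend` is hypothetical, the initial vector is taken to be the user's 0/1 collection column (the simplest binary choice), and ties are broken by object index.

```python
import numpy as np

def recommend(A, user, L):
    """Return the indices of the top-L uncollected objects for `user`.

    A is the n x m object-user adjacency matrix (rows: objects, columns: users).
    """
    A = np.asarray(A, dtype=float)
    k_obj = np.maximum(A.sum(axis=1), 1.0)  # object degrees (clamped to avoid 0/0)
    k_usr = np.maximum(A.sum(axis=0), 1.0)  # user degrees (clamped to avoid 0/0)
    # Step (i): weighted object network, W[i, j] = fraction o_j passes to o_i.
    W = ((A / k_usr) @ A.T) / k_obj
    # Step (ii): initial resource, here simply 1 on every collected object.
    f = A[:, user].copy()
    # Step (iii): final resource distribution f' = W f.
    f_final = W @ f
    # Step (iv): rank only the uncollected objects by final resource.
    collected = A[:, user] > 0
    f_final[collected] = -np.inf
    order = np.argsort(-f_final, kind="stable")
    n_uncollected = int((~collected).sum())
    return order[:min(L, n_uncollected)].tolist()

# Toy system with 4 objects and 3 users (hypothetical data).
A = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [0, 0, 1]])
print(recommend(A, user=0, L=2))  # [2, 3]
```

In the toy system, user 0 has collected objects 0 and 1; object 2 receives final resource 1/4 + 1/6 and object 3 receives 1/6, so object 2 is ranked first.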



FIG. 2: (Color online) The average degree of all recommended
movies vs. $\beta$. The black solid, red dashed and blue
dotted curves represent the cases with typical lengths
L = 10, 50 and 100, respectively. All the data points are
obtained by averaging over five independent runs with
different data-set divisions.

Numerical results. — For a given user $u_i$, the $j$th
element of $\mathbf{f}^i$ should be zero if $a_{ji} = 0$.
That is to say, one should not put any recommendation power
(i.e. resource) onto an uncollected object. The simplest
case is to set a uniform initial configuration as

$$f^i_j = a_{ji}. \qquad (2)$$

Under this configuration, all the objects collected by
$u_i$ have the same recommendation power. Despite its
simplicity, it can outperform the two most widely applied
recommendation algorithms, the global ranking method (GRM)
[13] and collaborative filtering (CF) [3].

To test the algorithmic accuracy, we use a benchmark
data set, namely MovieLens. The data consists of 1682
movies (objects) and 943 users, and users rate movies
using discrete ratings 1-5. We therefore applied a
coarse-graining method similar to that used in earlier work:
a movie has been collected by a user if and only if the
given rating is at least 3 (i.e. the user at least likes
this movie). The original data contains $10^5$ ratings,
85.25% of which are $\geq 3$, thus after coarse graining the
data contains 85250 user-object pairs. To test the
recommendation algorithms, the data set is randomly divided
into two parts: the training set contains 90% of the data,
and the remaining 10% of the data constitutes the probe.
The training set is treated as known information, while
no information in the probe set is allowed to be used for
prediction.
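
The coarse graining and the random 90%/10% division can be sketched as follows. The function names, the (user, object, rating) tuple layout, and the fixed random seed are assumptions made for illustration; they are not taken from the original experiments.

```python
import numpy as np

def coarse_grain(ratings, threshold=3):
    """Keep the (user, object) pairs whose rating is at least `threshold`."""
    return [(u, o) for (u, o, r) in ratings if r >= threshold]

def split_links(links, probe_fraction=0.1, seed=0):
    """Randomly divide the links into a training set and a probe set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(links))
    n_probe = int(round(probe_fraction * len(links)))
    probe = [links[i] for i in order[:n_probe]]
    train = [links[i] for i in order[n_probe:]]
    return train, probe

# Hypothetical toy ratings: (user, object, rating on the 1-5 scale).
ratings = [(0, 0, 5), (0, 1, 2), (1, 0, 3), (1, 2, 4), (2, 1, 1), (2, 2, 5)]
links = coarse_grain(ratings)
train, probe = split_links(links, probe_fraction=0.25)
print(len(links), len(train), len(probe))  # 4 3 1
```

Repeating the split with different seeds gives the independent runs over which the reported results are averaged.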

FIG. 3: (Color online) S vs. $\beta$. The black solid, red
dashed and blue dotted curves represent the cases with
typical lengths L = 10, 50 and 100, respectively. All the
data points are obtained by averaging over five independent
runs with different data-set divisions.

A recommendation algorithm should provide each user
with an ordered queue of all his/her uncollected objects.
For an arbitrary user $u_i$, if the relation $u_i$-$o_j$ is
in the probe set (according to the training set, $o_j$ is an
uncollected object for $u_i$), we measure the position of
$o_j$ in the ordered queue. For example, if there are 1000
uncollected movies for $u_i$, and $o_j$ is the 10th from the
top, we say the position of $o_j$ is 10/1000, denoted by
$r = 0.01$. Since the probe entries are actually collected
by users, a good algorithm is expected to rank them high,
thus leading to small $r$. Therefore, the mean value of the
position, $\langle r \rangle$ (called the ranking score
[11]), averaged over all the entries in the probe, can be
used to evaluate the algorithmic accuracy: the smaller the
ranking score, the higher the algorithmic accuracy, and vice
versa. Implementing the three algorithms mentioned above,
the average values of ranking scores over five independent
runs (one run here means an independent random division of
the data set) are 0.107, 0.122, and 0.140 for network-based
recommendation, collaborative filtering, and the global
ranking method, respectively. Clearly, even under the
simplest initial configuration, subject to the algorithmic
accuracy, the network-based recommendation outperforms
the other two algorithms.
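
A minimal sketch of the ranking score follows, reproducing the 10/1000 example from the text. The function name `ranking_score`, the array layout, and the tie rule (an object takes the best rank among equal scores) are assumptions of this illustration.

```python
import numpy as np

def ranking_score(scores, train, probe):
    """Mean relative position of the probe entries.

    scores: n x m array, scores[j, i] = final resource of object j for user i
    train:  n x m 0/1 array of known (training) user-object relations
    probe:  list of (user, object) pairs hidden from the algorithm
    """
    positions = []
    for user, obj in probe:
        uncollected = np.flatnonzero(train[:, user] == 0)
        s = scores[uncollected, user]
        # Position from the top: 1 + number of uncollected objects scored higher.
        rank = 1 + int((s > scores[obj, user]).sum())
        positions.append(rank / len(uncollected))
    return float(np.mean(positions))

# One user, 1000 uncollected movies, probe movie ranked 10th from the top.
train = np.zeros((1000, 1))
scores = np.arange(1000, 0, -1, dtype=float).reshape(-1, 1)
print(ranking_score(scores, train, probe=[(0, 9)]))  # 0.01
```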

Consider the initial resource located on an object as its
assigned recommendation power. In the whole recommendation
process, the total power given to $o_i$ is
$p_i = \sum_j f^j_i$, where the superscript $j$ runs over
all the users $u_j$. Under the uniform initial configuration
(see Eq. (2)), the total power of $o_i$ is
$p_i = \sum_j f^j_i = \sum_j a_{ij} = k(o_i)$.
That is to say, the total recommendation power assigned
to an object is proportional to its degree, thus the impact
of high-degree objects (e.g. popular movies) is enhanced.
Although it already has a good algorithmic accuracy, this
uniform configuration may be oversimplified, and suppressing
the impact of high-degree objects in an appropriate way
could, perhaps, further improve the accuracy. Motivated by
this, we propose a more complicated distribution of initial
resource to replace Eq. (2):

$$f^i_j = a_{ji} k^{\beta}(o_j), \qquad (3)$$

where $\beta$ is a tunable parameter. Compared with the
uniform case, $\beta = 0$, a positive $\beta$ strengthens
the influence of large-degree objects, while a negative
$\beta$ weakens the influence of large-degree objects. In
particular, the case $\beta = -1$ corresponds to an
identical allocation of recommendation power ($p_i = 1$)
for each object $o_i$.
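
The degree-dependent configuration can be sketched in a few lines. Summing it over all users gives a total power of $k^{1+\beta}(o_i)$ for each object, which reduces to $k(o_i)$ for $\beta = 0$ and to 1 for $\beta = -1$. The function name `initial_configuration` and the toy matrix are assumptions of this illustration.

```python
import numpy as np

def initial_configuration(A, user, beta):
    """Initial resource of `user`: k(o_j)**beta on collected objects, 0 elsewhere."""
    A = np.asarray(A, dtype=float)
    k_obj = A.sum(axis=1)
    f = np.zeros(A.shape[0])
    collected = A[:, user] > 0
    # Collected objects have degree >= 1, so a negative beta is safe here.
    f[collected] = k_obj[collected] ** beta
    return f

# Toy system with 4 objects and 3 users (hypothetical data).
A = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [0, 0, 1]])
beta = -0.8
k_obj = A.sum(axis=1)
# Total power of each object: sum of its initial resource over all users.
p = sum(initial_configuration(A, u, beta) for u in range(A.shape[1]))
print(np.allclose(p, k_obj ** (1 + beta)))  # True
```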

Fig. 1 reports the algorithmic accuracy as a function
of $\beta$. The curve has a clear minimum around
$\beta = -0.8$. Compared with the uniform case, the ranking
score can be further reduced by 9% at the optimal value,
which is a substantial improvement for recommendation
algorithms. Note that $\beta_{\rm opt}$ is close to $-1$,
which indicates that a more homogeneous distribution of
recommendation power among objects may lead to a more
accurate prediction.

Besides accuracy, another significant ingredient one
should take into account for a personal recommendation
algorithm is how personal this algorithm is. For example,
suppose there are 10 perfect movies not yet known to user
$u_i$, 8 of which are widely popular, while the other two
fit a certain specific taste of $u_i$. An algorithm
recommending the 8 popular movies is very nice for $u_i$,
but he/she may feel even better about a recommendation
list containing those two unpopular movies. Since there
are countless channels to obtain information on popular
movies (TV, the Internet, newspapers, radio, etc.),
uncovering very specific preferences, corresponding to
unpopular objects, is much more significant than simply
picking out what a user likes from the top of the list.
To measure this factor, we go simultaneously in two
directions. Firstly, given the length L of the
recommendation list, the popularity can be measured directly
by averaging the degree, $\langle k \rangle$, over all the
recommended objects. One can see from Fig. 2 that the
average degree is positively correlated with $\beta$, thus
suppressing the recommendation power of high-degree objects
gives more opportunity to unpopular objects. For L = 10, 50
and 100, the corresponding $\langle k \rangle$ are 353.50,
258.00 and 214.09 for GRM, and 84.62, 87.95 and 83.79 for
CF. Since GRM always recommends the most popular objects, it
is clear that $\langle k \rangle_{\rm GRM}$ is the largest.
On the other hand, CF mainly depends on the similarity
between users. Thus one user may be recommended an object
collected by another user having very similar habits, even
though this object may be very unpopular. This is the reason
why $\langle k \rangle_{\rm CF}$ is the smallest.

Secondly, one can measure the strength of personalization
via the Hamming distance. If the number of overlapping
objects in the recommendation lists of $u_i$ and $u_j$ is
$Q$, their Hamming distance is $H_{ij} = 1 - Q/L$.
Generally speaking, a more personal recommendation list
should have larger Hamming distances to other lists.
Accordingly, we use the mean value of the Hamming distance,
$S = \langle H_{ij} \rangle$, averaged over all the
user-user pairs, to measure the strength of personalization.
Fig. 3 plots S vs. $\beta$ and, in accordance with the
numerical results shown in Fig. 2, suppressing the influence
of high-degree objects makes the recommendations more
personal. For L = 10, 50 and 100, the corresponding S are
0.508, 0.397 and 0.337 for GRM, and 0.654, 0.501 and 0.421
for CF. Note that $S_{\rm GRM}$ is obviously larger than
zero, because the collected objects will not appear in the
recommendation list, thus different users have different
recommendation lists. Since CF has the potential to enhance
the user-user similarity, $S_{\rm CF}$ is remarkably smaller
than that corresponding to negative $\beta$ in network-based
recommendation.
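
Both personalization measures are simple to compute once every user has a recommendation list of length L. The sketch below is illustrative only: the function names and the toy lists are hypothetical, and all lists are assumed to have exactly L entries.

```python
import numpy as np
from itertools import combinations

def mean_recommended_degree(rec_lists, k_obj):
    """Average degree <k> over all recommended objects of all users."""
    return float(np.mean([k_obj[o] for rec in rec_lists for o in rec]))

def mean_hamming_distance(rec_lists, L):
    """S = <H_ij>, with H_ij = 1 - Q/L and Q the overlap of two lists."""
    dists = [1 - len(set(a) & set(b)) / L
             for a, b in combinations(rec_lists, 2)]
    return float(np.mean(dists))

# Three users with lists of length L = 2 (hypothetical data).
rec_lists = [[0, 1], [1, 2], [3, 4]]
k_obj = np.array([5, 4, 3, 2, 1])  # object degrees
print(mean_recommended_degree(rec_lists, k_obj))
print(mean_hamming_distance(rec_lists, L=2))
```

In the toy data the first two lists share one object (H = 0.5) and the third shares none with either (H = 1), so S is the mean of 0.5, 1 and 1. S = 0 would mean that all users receive the same list, and S = 1 that no two lists overlap.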

In a word, without any increase in the algorithmic
complexity, using an appropriate negative $\beta$ in our
algorithm outperforms the uniform case (i.e. $\beta = 0$)
for all three criteria: more accurate, less popular, and
more personalized.

Conclusions. — In this paper, we propose a recom- 
mendation algorithm based on a weighted object net- 
work. This algorithm is sensitive to the configuration 
of initial resource distribution. Even under the simplest 
case with binary resource, the current algorithm has re- 
markably higher accuracy than the widely applied GRM 
and CF. Since the computational complexity of this
algorithm is much less than that of CF, it has great
potential significance in practice. Furthermore, we
introduce a free parameter $\beta$ to regulate the initial
configuration of resource. Numerical results indicate that
decreasing the initial resource located on popular objects
further improves the algorithmic accuracy: in the optimal
case ($\beta_{\rm opt} \approx -0.8$), the distribution of
total initial resource located on each object is very
homogeneous ($p_i \sim k^{0.2}(o_i)$). Besides the ranking
score, many measures have been suggested to evaluate the
accuracy of personal recommendation algorithms [11, 20],
including hitting rate, precision, recall, F-measure, and
so on. However, thus far, there has been no consideration
of the degree of personalization. In this paper, we
suggest two measures, $\langle k \rangle$ and S, to address
this issue. We argue that to evaluate the performance of a
recommendation algorithm, one should take into account not
only the accuracy, but also the degree of personalization
and the popularity of recommended objects. Even under this
stricter criterion, the case with
$\beta_{\rm opt} \approx -0.8$ outperforms the uniform case.
Theoretical physics provides us with some beautiful and
powerful tools for dealing with this long-standing challenge
in modern information science: how to make personal
recommendations. We believe the current work can enlighten
readers in this interesting direction.